Scalable Algorithms for Tractable Schatten Quasi-Norm Minimization

Authors

  • Fanhua Shang
  • Yuanyuan Liu
  • James Cheng
Abstract

The Schatten-p quasi-norm (0 < p < 1) is often used in place of the standard nuclear norm in order to approximate the rank function more accurately. However, existing Schatten-p quasi-norm minimization algorithms involve singular value decomposition (SVD) or eigenvalue decomposition (EVD) in each iteration, and thus may become very slow and impractical for large-scale problems. In this paper, we first define two tractable Schatten quasi-norms, i.e., the Frobenius/nuclear hybrid and bi-nuclear quasi-norms, and then prove that they are in essence the Schatten-2/3 and Schatten-1/2 quasi-norms, respectively, which leads to the design of very efficient algorithms that only need to update two much smaller factor matrices. We also design two efficient proximal alternating linearized minimization algorithms for solving representative matrix completion problems. Finally, we provide global convergence and performance guarantees for our algorithms, which have better convergence properties than existing algorithms. Experimental results on synthetic and real-world data show that our algorithms are more accurate than the state-of-the-art methods, and are orders of magnitude faster.

Introduction

In recent years, the matrix rank minimization problem has arisen in a wide range of applications such as matrix completion, robust principal component analysis, low-rank representation, multivariate regression and multi-task learning. To solve such problems, Fazel, Hindi, and Boyd (2001), Candès and Tao (2010), and Recht, Fazel, and Parrilo (2010) suggested relaxing the rank function by its convex envelope, i.e., the nuclear norm. In fact, the nuclear norm is equivalent to the l1-norm on the singular values of a matrix, and thus it promotes a low-rank solution. However, it has been shown in (Fan and Li 2001) that l1-norm regularization over-penalizes large entries of vectors and results in a biased solution. Given the intimate relationship between the two, the nuclear norm penalty likewise over-penalizes large singular values; that is, it may make the solution deviate from the original solution just as the l1-norm does (Nie, Huang, and Ding 2012; Lu et al. 2015). Compared with the nuclear norm, the Schatten-p quasi-norm for 0 < p < 1 gives a closer approximation to the rank function. Consequently, Schatten-p quasi-norm minimization has attracted a great deal of attention in image recovery (Lu and Zhang 2014; Lu et al. 2014), collaborative filtering (Nie et al. 2012; Lu et al. 2015; Mohan and Fazel 2012) and MRI analysis (Majumdar and Ward 2011). In addition, many non-convex surrogate functions of the l0-norm listed in (Lu et al. 2014; Lu et al. 2015) have been extended to approximate the rank function, such as SCAD (Fan and Li 2001) and MCP (Zhang 2010).

All of the non-convex surrogate functions mentioned above lead to non-convex, non-smooth, and even non-Lipschitz optimization problems when used for low-rank minimization. Therefore, it is crucial to develop fast and scalable algorithms that are specialized to solve alternative formulations. Lai, Xu, and Yin (2013) proposed an iteratively reweighted least squares (IRucLq) algorithm to approximate the Schatten-p quasi-norm minimization problem, and proved that the limit point of any convergent subsequence generated by their algorithm is a critical point. Moreover, Lu et al. (2014) proposed an iteratively reweighted nuclear norm (IRNN) algorithm to solve many non-convex surrogate minimization problems. For matrix completion problems, the Schatten-p quasi-norm has been shown to be empirically superior to the nuclear norm (Marjanovic and Solo 2012). In addition, Zhang, Huang, and Zhang (2013) theoretically proved that Schatten-p quasi-norm minimization with small p requires significantly fewer measurements than convex nuclear norm minimization. However, all existing algorithms are iterative and involve an SVD or EVD in each iteration, which incurs high computational cost and is too expensive for solving large-scale problems (Cai and Osher 2013; Liu et al. 2014). In contrast, as an alternative non-convex formulation of the nuclear norm, the bilinear spectral regularization as in (Srebro, Rennie, and Jaakkola 2004; Recht, Fazel, and Parrilo 2010) has been successfully applied in many large-scale applications, e.g., collaborative filtering (Mitra, Sheorey, and Chellappa 2010). As the Schatten-p quasi-norm is equivalent to the lp quasi-norm on the singular values of a matrix, it is natural to ask the following question: can we design equivalent matrix factorization forms for particular cases of the Schatten quasi-norm, e.g., p = 2/3 or 1/2?

To answer this question, in this paper we first define two tractable Schatten quasi-norms, i.e., the Frobenius/nuclear hybrid and bi-nuclear quasi-norms. We then prove that they are in essence the Schatten-2/3 and Schatten-1/2 quasi-norms, respectively; to minimize them, we only need to perform SVDs on two much smaller factor matrices, in contrast to the SVDs of the full matrix required by existing algorithms, e.g., IRNN. Therefore, our method is particularly useful for many "big data" applications that need to deal with large, high-dimensional data with missing values. To the best of our knowledge, this is the first paper to scale Schatten quasi-norm solvers to the Netflix dataset. Moreover, we provide global convergence and recovery performance guarantees for our algorithms; to our knowledge, these are the best convergence guarantees available for algorithms that solve such challenging problems.

Notations and Background

The Schatten-p norm (0 < p < ∞) of a matrix X ∈ R^{m×n} (m ≥ n) is defined as
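
The displayed equation that this sentence introduces is missing from the extract; what follows is the standard definition of the Schatten-p norm, with σ_i(X) denoting the i-th singular value of X, reproduced here for completeness:

\[
\|X\|_{S_p} \;=\; \Bigg(\sum_{i=1}^{n} \sigma_i^p(X)\Bigg)^{1/p}.
\]

For p ≥ 1 this is a genuine norm (p = 1 gives the nuclear norm and p = 2 the Frobenius norm), whereas for 0 < p < 1 the triangle inequality fails and it is only a quasi-norm.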


Similar Articles

Tractable and Scalable Schatten Quasi-Norm Approximations for Rank Minimization

The Schatten quasi-norm was introduced to bridge the gap between the trace norm and the rank function. However, existing algorithms are too slow or even impractical for large-scale problems. Motivated by the equivalence relation between the trace norm and its bilinear spectral penalty, we define two tractable Schatten norms, i.e., the bi-trace and tri-trace norms, and prov...
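
The "equivalence relation between the trace norm and its bilinear spectral penalty" mentioned above is the well-known identity (see Srebro, Rennie, and Jaakkola 2004; Recht, Fazel, and Parrilo 2010): for any matrix X of rank at most d,

\[
\|X\|_{*} \;=\; \min_{U \in \mathbb{R}^{m\times d},\, V \in \mathbb{R}^{n\times d}:\, X = UV^{T}} \|U\|_{F}\,\|V\|_{F}
\;=\; \min_{X = UV^{T}} \tfrac{1}{2}\big(\|U\|_{F}^{2} + \|V\|_{F}^{2}\big),
\]

so the nuclear norm of a large matrix can be controlled through the Frobenius norms of two much smaller factors, without computing an SVD of X itself.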


Unified Scalable Equivalent Formulations for Schatten Quasi-Norms

The Schatten quasi-norm can be used to bridge the gap between the nuclear norm and the rank function. However, most existing algorithms are too slow or even impractical for large-scale problems, due to the singular value decomposition (SVD) or eigenvalue decomposition (EVD) of the whole matrix in each iteration. In this paper, we rigorously prove that for any 0 < p ≤ 1, p1, p2 > 0 satisfying 1/p = 1/p...
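
The statement is truncated here, but the result it refers to appears to be a factored characterization of the Schatten quasi-norm of the following form (stated as an assumption based on the abstract; the precise conditions are in the cited paper): for 1/p = 1/p1 + 1/p2 and any X of rank at most d,

\[
\|X\|_{S_p} \;=\; \min_{U \in \mathbb{R}^{m\times d},\, V \in \mathbb{R}^{n\times d}:\, X = UV^{T}} \|U\|_{S_{p_1}}\,\|V\|_{S_{p_2}}.
\]

Up to the exact normalization used there, the two special cases in the main paper above then fall out directly: p1 = p2 = 1 gives 1/p = 2, i.e., the bi-nuclear quasi-norm coincides with the Schatten-1/2 quasi-norm, while p1 = 1, p2 = 2 (nuclear norm on one factor, Frobenius norm on the other) gives 1/p = 3/2, i.e., the Frobenius/nuclear hybrid quasi-norm coincides with the Schatten-2/3 quasi-norm.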


Performance Guarantees for Schatten-$p$ Quasi-Norm Minimization in Recovery of Low-Rank Matrices

We address some theoretical guarantees for Schatten-p quasi-norm minimization (p ∈ (0, 1]) in recovering low-rank matrices from compressed linear measurements. First, using the null space properties of the measurement operator, we provide a sufficient condition for exact recovery of low-rank matrices. This condition guarantees unique recovery of matrices of rank equal to or larger than what is guaran...


Supplementary Materials for "Tractable and Scalable Schatten Quasi-Norm Approximations for Rank Minimization"

In this supplementary material, we give the detailed proofs of some lemmas, properties and theorems, as well as some additional experimental results on synthetic data and four recommendation-system data sets. A. More Notations: R^n denotes the n-dimensional Euclidean space, and the set of all m×n matrices with real entries is denoted by R^{m×n}. Given matrices X and Y ∈ R^{m×n}, the inner product is...
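
The sentence above is cut off; presumably it introduces the standard trace inner product on R^{m×n}, which for completeness is

\[
\langle X, Y \rangle \;=\; \operatorname{tr}\!\big(X^{T} Y\big) \;=\; \sum_{i=1}^{m}\sum_{j=1}^{n} X_{ij} Y_{ij},
\]

whose induced norm \(\sqrt{\langle X, X\rangle}\) is the Frobenius norm \(\|X\|_F\).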


A Unified Convex Surrogate for the Schatten-p Norm

The Schatten-p norm (0 < p < 1) has been widely used to replace the nuclear norm for better approximating the rank function. However, existing methods are either 1) not scalable for large-scale problems due to relying on singular value decomposition (SVD) in every iteration, or 2) specific to some p values, e.g., 1/2 and 2/3. In this paper, we show that for any p, p1, and p2 > 0 satisfying 1/p...
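
To make the computational appeal of such factored formulations concrete, here is a minimal, illustrative sketch (not the algorithm of any of the papers above) of a PALM-style solver for factored matrix completion with nuclear-norm regularization on the two small factors, i.e., roughly min over U, V of 0.5*||P_Ω(UV^T − M)||_F^2 + λ(||U||_* + ||V||_*). All function and parameter names (svt, factored_mc, d, lam, n_iter) are illustrative choices, and the only SVDs performed are on the thin m×d and n×d factors:

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm.
    A is a thin factor (m x d or n x d), so this SVD is cheap."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def factored_mc(M, mask, d=10, lam=0.1, n_iter=200, seed=0):
    """Illustrative PALM-style alternating updates for
        min_{U,V} 0.5*||mask*(U V^T - M)||_F^2 + lam*(||U||_* + ||V||_*).
    Only thin SVDs of the two factors are needed, never of the full matrix."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, d)) / np.sqrt(d)
    V = rng.standard_normal((n, d)) / np.sqrt(d)
    for _ in range(n_iter):
        # Proximal-gradient step on U (step size 1/L_U with L_U >= ||V||_2^2)
        R = mask * (U @ V.T - M)
        Lu = max(np.linalg.norm(V, 2) ** 2, 1e-12)
        U = svt(U - (R @ V) / Lu, lam / Lu)
        # Proximal-gradient step on V
        R = mask * (U @ V.T - M)
        Lv = max(np.linalg.norm(U, 2) ** 2, 1e-12)
        V = svt(V - (R.T @ U) / Lv, lam / Lv)
    return U, V

if __name__ == "__main__":
    # Toy usage: recover a rank-5 matrix from 40% of its entries.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 150))
    mask = (rng.random(A.shape) < 0.4).astype(float)
    U, V = factored_mc(mask * A, mask, d=8, lam=0.05, n_iter=500)
    print("relative error:", np.linalg.norm(U @ V.T - A) / np.linalg.norm(A))
```

Because each proximal step only thresholds the singular values of an m×d or n×d factor with d much smaller than min(m, n), the per-iteration cost is dominated by the residual products rather than by a full SVD, which is what makes factored formulations attractive at scale.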




Publication date: 2016